feat: reduce CI time#819
Conversation
1. silent camodel 2. reduce inputs 3. deep merge vpto cases 4. build light tilelang dsl smoke test
There was a problem hiding this comment.
Code Review
This pull request introduces a comprehensive system testing (ST) suite for TileLang operators on the NPU (a5 architecture), including CMake configurations, host drivers, data generation, and comparison scripts for a wide range of operators. It also adds a utility script to prepare a quiet camodel directory to reduce simulator I/O. The reviewer feedback highlights several key improvements: replacing the non-deterministic hash() function with zlib.crc32 for random seeding in st_common.py, opening the lock file in append mode ("a") in prepare_quiet_camodel.py to prevent premature truncation, making the hardcoded Ascend driver path configurable in CMakeLists.txt, and catching ValueError in the softmax comparison script to handle corrupted output files gracefully.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| def setup_case_rng(case): | ||
| """Set a per-case deterministic random seed. | ||
|
|
||
| Using hash(name) ensures that adding/reordering cases does not change | ||
| the random data of existing cases. | ||
| """ | ||
| np.random.seed(hash(case["name"]) & 0xFFFFFFFF) |
There was a problem hiding this comment.
Using Python's built-in hash() function for seeding np.random.seed is not deterministic across different runs. In Python 3, hash randomization is enabled by default, meaning hash(case["name"]) will return a different value every time the Python process restarts. To ensure reproducible test data, use a stable hashing algorithm like zlib.crc32.
| def setup_case_rng(case): | |
| """Set a per-case deterministic random seed. | |
| Using hash(name) ensures that adding/reordering cases does not change | |
| the random data of existing cases. | |
| """ | |
| np.random.seed(hash(case["name"]) & 0xFFFFFFFF) | |
| def setup_case_rng(case): | |
| """Set a per-case deterministic random seed. | |
| Using a stable CRC32 hash ensures that adding/reordering cases does not change | |
| the random data of existing cases, and remains deterministic across runs. | |
| """ | |
| import zlib | |
| seed = zlib.crc32(case['name'].encode('utf-8')) & 0xFFFFFFFF | |
| np.random.seed(seed) |
| lock_path = os.path.join(quiet_dir, ".quiet-camodel.lock") | ||
| with open(lock_path, "w", encoding="utf-8") as lock_file: | ||
| fcntl.flock(lock_file, fcntl.LOCK_EX) | ||
| return prepare_quiet_camodel_locked(source_dir, quiet_dir) |
There was a problem hiding this comment.
Opening the lock file in write ("w") mode truncates the file before acquiring the lock. If another process is currently holding the lock or waiting for it, this truncation can cause race conditions or interfere with its operations. It is a safer best practice to open lock files in append ("a") mode to avoid truncating the file.
| lock_path = os.path.join(quiet_dir, ".quiet-camodel.lock") | |
| with open(lock_path, "w", encoding="utf-8") as lock_file: | |
| fcntl.flock(lock_file, fcntl.LOCK_EX) | |
| return prepare_quiet_camodel_locked(source_dir, quiet_dir) | |
| lock_path = os.path.join(quiet_dir, ".quiet-camodel.lock") | |
| with open(lock_path, "a", encoding="utf-8") as lock_file: | |
| fcntl.flock(lock_file, fcntl.LOCK_EX) | |
| return prepare_quiet_camodel_locked(source_dir, quiet_dir) |
| set(PTO_ISA_ROOT "${CMAKE_CURRENT_LIST_DIR}/../../../../../../../../pto-isa" CACHE PATH "Path to pto-isa repo") | ||
| set(PTO_TILELANG_ST_COMMON_DIR | ||
| "${CMAKE_CURRENT_LIST_DIR}/../common") | ||
| set(ASCEND_DRIVER_PATH /usr/local/Ascend/driver) |
There was a problem hiding this comment.
The Ascend driver path is hardcoded to /usr/local/Ascend/driver. If the driver is installed in a non-standard directory, the build will fail. It is better to allow overriding this path via a CMake cache variable or environment variable.
if(NOT DEFINED ASCEND_DRIVER_PATH)
set(ASCEND_DRIVER_PATH /usr/local/Ascend/driver CACHE PATH "Path to Ascend driver directory")
endif()
| except FileNotFoundError as exc: | ||
| print(style_fail(f"[ERROR] {case['name']}: missing file {exc}")) | ||
| return False |
There was a problem hiding this comment.
If any of the output binary files are incomplete or corrupted (e.g., due to a simulator crash), np.fromfile().reshape() will raise a ValueError. Since only FileNotFoundError is caught here, a ValueError will cause the comparison script to crash with a traceback. Catching ValueError along with FileNotFoundError makes the test runner more robust.
| except FileNotFoundError as exc: | |
| print(style_fail(f"[ERROR] {case['name']}: missing file {exc}")) | |
| return False | |
| except (FileNotFoundError, ValueError) as exc: | |
| print(style_fail(f'[ERROR] {case["name"]}: failed to load array {exc}')) | |
| return False |
Codex Review该评论由 review 机器人自动更新。
SummaryReview failed at stage Findings未生成结构化 findings,因为 review 过程提前失败。 Log Tail |
A3 板测失败
失败用例
|
A3 板测失败详情:PR #819orchestration_example_kernel_add
vector_example_dag_kernel_add_scalar
paged_attention_example_kernel_pv_matmul
paged_attention_example_kernel_init_inplace
vector_example_dag_kernel_add
paged_attention_example_kernel_online_update
paged_attention_example_kernel_softmax_prepare
orchestration_example_kernel_add_scalar
paged_attention_example_kernel_qk_matmul
orchestration_example_kernel_mul
vector_example_dag_kernel_mul
prelu
plan_memory_bind_tile_alias_liveness
plan_memory_peak_exact_capacity
plan_memory_loop_no_reuse_outer_live
plan_memory_if_yield
plan_memory_loop_in_if
plan_memory_peak_8_overlapping
plan_memory_if_in_loop
plan_memory_fragmentation_hole_fit
plan_memory_for_iter_args_yield
plan_memory_no_reuse_overlap
plan_memory_reuse_sequential
plan_memory_nested_loops
plan_memory_fragmentation_two_holes
rems
partition_view_verify_rank_mismatch_valid
partition_view_verify_valid
partition5d_dynamic
partition5d
sparse_attn_test_incore_7
decode_hca_test_incore_54
attention_swa_test_incore_40
decode_swa_test_incore_40
decode_csa_test_incore_81
attention_hca_test_incore_54
attention_csa_test_refresh_incore_81
tensor_view_layout_dn
rope_kv_cache
qwen3_decode_incore_4
post_rmsnorm
qwen3_decode_incore_1
qwen3_decode_incore_10
qwen3_decode_incore_11
rmsnorm
qwen3_decode_incore_6
qwen3_decode_incore_2
qwen3_decode_incore_7
qwen3_decode_incore_5
qwen3_decode_incore_12
test_barrier_sync
matmul
add_double_dynamic
nested_loop_confliect
rar_optimization_test
test_dynamic_valid_shape
test_auto_sync_tail_hint
compensation_test
rem
|
Summary
Validation